At large enterprises (such as large companies, educational institutions, government agencies, and so forth), the cost of storing data is a growing concern. A large enterprise may have to provide enough storage resources to hold massive amounts of data. The costs of storing data include costs associated with the following: hardware and associated software for storage systems, office and data center real estate, information technology personnel, power resources, cooling resources, network resources for moving data, and so forth.
A substantial portion of data growth is attributable to duplication of data. While some duplication serves the purposes of data protection or caching, other types of duplication are wasteful. For example, various employees of an enterprise may individually download the same large data files to different storage locations, so that the enterprise stores multiple identical copies.